Skip to content

fix(jdbc): prevent BigDecimal DoS in StringUtils.consistentToString#17508

Open
Young-Leo wants to merge 2 commits intoapache:masterfrom
Young-Leo:fix/jdbc-bigdecimal-dos
Open

fix(jdbc): prevent BigDecimal DoS in StringUtils.consistentToString#17508
Young-Leo wants to merge 2 commits intoapache:masterfrom
Young-Leo:fix/jdbc-bigdecimal-dos

Conversation

@Young-Leo
Copy link
Copy Markdown
Contributor

Summary

A BigDecimal constructed from extreme scientific notation (e.g. new BigDecimal("1e1000000000")) has scale = -1_000_000_000. Calling toPlainString() on such a value materializes a plain-decimal string of ~1 × 10⁹ characters (~2 GB), exhausting the JDBC client JVM heap and causing a denial of service.

This path is reachable from user-controlled input via:

IoTDBPreparedStatement.setObject(idx, maliciousBigDecimal, Types.VARCHAR);

which routes through StringUtils.consistentToString(BigDecimal) → reflective BigDecimal.toPlainString().

Reproducer (before the fix)

BigDecimal d = new BigDecimal("1e1000000000");
StringUtils.consistentToString(d); // → OutOfMemoryError

Observed on JDK 8, -Xmx512m:

Input scale Before fix After fix
1e1000 -1000 1001 chars, ~30 ms 1001 chars, ~28 ms (unchanged)
1e100000000 -100 000 000 OutOfMemoryError (swallowed by catch) 12 chars, ~20 ms (scientific notation)
1e1000000000 (PoC) -1 000 000 000 OutOfMemoryError 13 chars "1E+1000000000", ~17 ms

Note: because InvocationTargetException also wraps OutOfMemoryError, the existing catch block can silently swallow the OOM on small-heap clients and fall back to toString(). On larger heaps the allocation succeeds and the client process is OOM-killed. Both behaviors are undesirable.

Fix

Add a scale guard in StringUtils.consistentToString(BigDecimal). When |scale| > 10_000, fall back to BigDecimal.toString() (compact scientific form) instead of invoking toPlainString(). The threshold is orders of magnitude above any realistic decimal precision and is therefore transparent to legitimate data (a 10 000-char plain string is only ~10 KB).

if (decimal.scale() > MAX_PLAIN_STRING_SCALE
    || decimal.scale() < -MAX_PLAIN_STRING_SCALE) {
    return decimal.toString();
}

Also tightens the reflective invocation: toPlainStringMethod.invoke(decimal, (Object[]) null) to remove the varargs-null ambiguity warning.

Why fix it here (and not in IoTDBPreparedStatement)

StringUtils.consistentToString is the only caller of BigDecimal.toPlainString() in the JDBC driver. Fixing it at the root closes every existing and future call site with a single change, rather than duplicating the check in setObject for CHAR/VARCHAR/LONGVARCHAR.

Tests

Added StringUtilsTest covering:

  • null input
  • Normal decimal (123.45 and 1E+2"100")
  • Boundary case at MAX_PLAIN_STRING_SCALE (still expanded to plain form)
  • Regression for the PoC payload 1e1000000000 (must return compact scientific form, not a 1 GB string)
  • Symmetric case for huge positive scale (1e-1000000000)

All 5 tests pass.

Risk / compatibility

  • No API changes.
  • Behavior only differs for pathological values whose plain string would exceed ~10 KB — these inputs previously produced either an OOM (swallowed) or a multi-GB string and could not have been relied on.
  • Normal decimal data is bit-for-bit identical to before.

Credits

Vulnerability reported by:
Ho1aAs, movonow, zhid889773, WA-888, binghuangczc, errors11, Phoexina, skydiver-jay, V9d0g, FlyFish.

A BigDecimal such as new BigDecimal(1e1000000000) has scale -1e9.
Calling toPlainString() on it would materialize a ~1GB String and OOM
the client JVM. This is reachable from IoTDBPreparedStatement.setObject
with targetSqlType in {CHAR, VARCHAR, LONGVARCHAR}, so an attacker that
controls the prepared-statement parameter value can cause a denial of
service on the JDBC client.

Add a scale guard (|scale| <= 10000) before invoking toPlainString. For
any value outside that range, fall back to BigDecimal.toString(), which
is always a few dozen characters. 10000 is vastly larger than any
realistic precision and leaves normal decimal data unaffected.

Also tighten the reflective invocation to pass (Object[]) null to silence
the varargs ambiguity warning.

Adds StringUtilsTest covering null, normal values, the boundary and the
two extreme-scale regressions.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the JDBC client’s StringUtils.consistentToString(BigDecimal) against denial-of-service scenarios where BigDecimal.toPlainString() would expand extreme scientific notation into an enormous plain-decimal string and exhaust heap memory.

Changes:

  • Add a scale-based guard (MAX_PLAIN_STRING_SCALE) to bypass toPlainString() for pathological BigDecimal scales and fall back to toString().
  • Tighten the reflective toPlainString() invocation to avoid varargs-null ambiguity.
  • Add focused unit tests covering normal behavior, boundary behavior, and DoS regressions for huge positive/negative scales.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
iotdb-client/jdbc/src/main/java/org/apache/iotdb/jdbc/StringUtils.java Adds a scale guard to prevent huge toPlainString() expansions; adjusts reflective invocation.
iotdb-client/jdbc/src/test/java/org/apache/iotdb/jdbc/StringUtilsTest.java Introduces regression and boundary tests validating safe behavior for extreme-scale BigDecimal inputs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +51 to +53
* materialize a ~1GB {@link String}, causing {@link OutOfMemoryError} / denial of service on the
* client. Any scale beyond this bound is rejected for expansion and the caller gets the
* scientific form instead, which is a few dozen characters at most.
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Javadoc describes 1e1000000000 as materializing a "~1GB" String, but the impact depends on JVM/string representation (e.g., on JDK 8 a 1e9-character String is closer to ~2GB just for the char array). Consider rewording to an implementation-independent description like "~1e9 characters" / "multi-GB" to keep the documentation accurate across supported Java versions.

Suggested change
* materialize a ~1GB {@link String}, causing {@link OutOfMemoryError} / denial of service on the
* client. Any scale beyond this bound is rejected for expansion and the caller gets the
* scientific form instead, which is a few dozen characters at most.
* materialize a plain decimal string with roughly 1e9 characters, requiring a very large
* (potentially multi-GB) {@link String} allocation and causing {@link OutOfMemoryError} /
* denial of service on the client. Any scale beyond this bound is rejected for expansion and
* the caller gets the scientific form instead, which is a few dozen characters at most.

Copilot uses AI. Check for mistakes.
Comment on lines +55 to +56
* {@link BigDecimal#toPlainString()} to allocate a ~1GB string and OOM the JVM. The guard must
* refuse to expand such values and return the compact scientific form instead.
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test’s Javadoc mentions toPlainString() allocating a "~1GB" string. Since the problematic output is ~1e9 characters and the byte size varies by JVM (e.g., JDK 8 uses 2 bytes/char), consider rephrasing to "~1e9 characters" or "multi-GB" so the comment stays correct across Java versions.

Suggested change
* {@link BigDecimal#toPlainString()} to allocate a ~1GB string and OOM the JVM. The guard must
* refuse to expand such values and return the compact scientific form instead.
* {@link BigDecimal#toPlainString()} to allocate a ~1e9-character string and OOM the JVM. The
* guard must refuse to expand such values and return the compact scientific form instead.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants